# Deploying Dagster on Helm, Advanced

## Overview

We extend the [previous Helm deployment guide](/deployment/guides/kubernetes/deploying-with-helm)
by deploying a more complex configuration of Dagster, which utilizes the <PyObject module="dagster_celery_k8s" object="CeleryK8sRunLauncher" />.

## Prerequisites

In addition to the [previous prerequisites](/deployment/guides/kubernetes/deploying-with-helm#prerequsites),
we expect familiarity with [Celery, a distributed task queue system](https://docs.celeryproject.org/en/stable/).

## Deployment Architecture

<!-- https://excalidraw.com/#json=4680957890134016,q6NWURUuPP_VThmbRQ89Jg -->

![dagster-kubernetes-advanced-architecture.png](/images/deploying/dagster-kubernetes-advanced-architecture.png)

### Components

<table>
  <tr style={{ background: "#F8F8F8" }}>
    <th>Component Name</th>
    <th>Type</th>
    <th>Image</th>
  </tr>
  <tr>
    <td>Celery</td>
    <td>
      <a href="https://kubernetes.io/docs/concepts/workloads/controllers/deployment/">
        Deployment
      </a>
    </td>
    <td>
      <a href="https://hub.docker.com/r/dagster/dagster-celery-k8s">
        dagster/dagster-celery-k8s
      </a>{" "}
      <i>(released weekly)</i>
    </td>
  </tr>
  <tr style={{ background: "#F8F8F8" }}>
    <td>Daemon</td>
    <td>
      <a href="https://kubernetes.io/docs/concepts/workloads/controllers/deployment/">
        Deployment
      </a>
    </td>
    <td>
      <a href="https://hub.docker.com/r/dagster/dagster-celery-k8s">
        dagster/dagster-celery-k8s
      </a>{" "}
      <i>(released weekly)</i>
    </td>
  </tr>
  <tr>
    <td>Dagit</td>
    <td>
      <a href="https://kubernetes.io/docs/concepts/workloads/controllers/deployment/">
        Deployment
      </a>{" "}
      behind a{" "}
      <a href="https://kubernetes.io/docs/concepts/services-networking/service/">
        Service
      </a>
    </td>
    <td>
      <a href="https://hub.docker.com/r/dagster/dagster-celery-k8s">
        dagster/dagster-celery-k8s
      </a>{" "}
      <i>(released weekly)</i>
    </td>
  </tr>
  <tr style={{ background: "#F8F8F8" }}>
    <td>Database</td>
    <td>PostgreSQL</td>
    <td>
      <a href="https://hub.docker.com/_/postgres">postgres</a> <i>(Optional)</i>
    </td>
  </tr>
  <tr>
    <td>
      Flower <i>(Optional)</i>
    </td>
    <td>
      <a href="https://kubernetes.io/docs/concepts/workloads/controllers/deployment/">
        Deployment
      </a>{" "}
      behind a{" "}
      <a href="https://kubernetes.io/docs/concepts/services-networking/service/">
        Service
      </a>
    </td>
    <td>
      <a href="https://hub.docker.com/r/mher/flower">mher/flower</a>
    </td>
  </tr>
  <tr style={{ background: "#F8F8F8" }}>
    <td>Run Worker</td>
    <td>
      <a href="https://kubernetes.io/docs/concepts/workloads/controllers/job/">
        Job
      </a>
    </td>
    <td>
      User-provided or{" "}
      <a href="https://hub.docker.com/r/dagster/user-code-example">
        dagster/user-code-example
      </a>{" "}
      <i>(released weekly)</i>{" "}
    </td>
  </tr>
  <tr>
    <td>Step Job</td>
    <td>
      <a href="https://kubernetes.io/docs/concepts/workloads/controllers/job/">
        Job
      </a>
    </td>
    <td>
      User-provided or{" "}
      <a href="https://hub.docker.com/r/dagster/user-code-example">
        dagster/user-code-example
      </a>{" "}
      <i>(released weekly)</i>{" "}
    </td>
  </tr>
  <tr style={{ background: "#F8F8F8" }}>
    <td>User Code Deployment</td>
    <td>
      <a href="https://kubernetes.io/docs/concepts/workloads/controllers/deployment/">
        Deployment
      </a>{" "}
      behind a{" "}
      <a href="https://kubernetes.io/docs/concepts/services-networking/service/">
        Service
      </a>
    </td>
    <td>
      User-provided or{" "}
      <a href="https://hub.docker.com/r/dagster/user-code-example">
        dagster/user-code-example
      </a>{" "}
      <i>(released weekly)</i>{" "}
    </td>
  </tr>
</table>

### Celery

Dagster uses Celery to provide step-level isolation and to limit the number of concurrent connections to a resource.
Users can configure multiple Celery queues (for example, one Celery queue
for each resource the user would like to limit) and multiple Celery workers per queue via the
`runLauncher.config.celeryK8sRunLauncher.workerQueues` section of `values.yaml`.
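
As a rough sketch of what that section might look like, the fragment below defines two queues with different worker counts. The `name` and `replicaCount` fields are assumptions about the chart's schema and should be checked against the `values.yaml` shipped with your chart version; `limited-resource-queue` is a hypothetical queue name used only for illustration.

```yaml
runLauncher:
  type: CeleryK8sRunLauncher
  config:
    celeryK8sRunLauncher:
      workerQueues:
        # Assumed default queue, served by two Celery workers.
        - name: "dagster"
          replicaCount: 2
        # Hypothetical queue for a resource whose concurrency should be limited;
        # a single worker serves it.
        - name: "limited-resource-queue"
          replicaCount: 1
```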

The Celery workers poll for new Celery tasks and execute each task in order of receipt or priority. Each Celery task consists
largely of launching an ephemeral Step Job (a Kubernetes Job) to execute that step.

Using Celery requires configuring the <PyObject
module="dagster_celery_k8s" object="CeleryK8sRunLauncher" /> and <PyObject module="dagster_celery_k8s" object="celery_k8s_job_executor" />.

### Daemon

Same as the [prior description](/deployment/guides/kubernetes/deploying-with-helm#daemon), except that the Daemon is configured with the <PyObject module="dagster_celery_k8s" object="CeleryK8sRunLauncher" />.

### Dagit

Same as the [prior description](/deployment/guides/kubernetes/deploying-with-helm#dagit), except that Dagit is configured with the <PyObject module="dagster_celery_k8s" object="CeleryK8sRunLauncher" />.

### Database

Same as the [prior description](/deployment/guides/kubernetes/deploying-with-helm#database).

### Flower

[Flower](https://flower.readthedocs.io/en/latest/) is an optional component that can be useful for monitoring Celery queues
and workers.

### Run Worker

Building off the [prior description](/deployment/guides/kubernetes/deploying-with-helm#run-worker), the main difference in this deployment is
that the Run Worker submits steps that are ready to be executed to the corresponding Celery queue (instead of executing
the step itself). As before, the Run Worker is responsible for traversing the execution plan.

### Step Job

The Step Job is responsible for executing a single step and writing structured events to the database. The Celery worker
polls for the Step Job's completion.

### User Code Deployment

Same as the [prior description](/deployment/guides/kubernetes/deploying-with-helm#user-code-deployment).

## Walkthrough

We assume that you've followed the initial steps in the [previous walkthrough](/deployment/guides/kubernetes/deploying-with-helm#walkthrough)
by building the Docker image for your user code, pushing it to a registry, adding the Dagster Helm
chart repository, and configuring your Helm User Deployment values.

### Configure Persistent Object Storage

We need to configure persistent object storage so that data can be serialized and passed between steps.
To run the Dagster User Code example, create an S3 bucket named "dagster-test".

To enable Dagster to connect to S3, provide `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables via the `env`, `envConfigMaps`, or `envSecrets`
fields under `userDeployments` in `values.yaml` or (not recommended) by setting these variables directly in the User Code Deployment image.
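
For illustration, a minimal sketch of the `envSecrets` route follows. It assumes that a Kubernetes Secret holding both variables already exists in the release namespace and that each `envSecrets` entry takes a `name` key; the Secret name `dagster-aws-credentials` is hypothetical, and the exact shape of the entry should be confirmed against your chart's `values.yaml`.

```yaml
userDeployments:
  enabled: true
  deployments:
    - name: "k8s-example-user-code-1"
      # ... image, port, and other settings from the previous guide ...
      envSecrets:
        # Hypothetical Secret containing AWS_ACCESS_KEY_ID and
        # AWS_SECRET_ACCESS_KEY as keys.
        - name: dagster-aws-credentials
```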

### Install the Dagster Helm Chart

Install the Helm chart and create a release. Below, we've named our release `dagster-release`.
We use `helm upgrade --install` to create the release if it does not exist; otherwise, the
existing `dagster-release` will be modified:

    helm upgrade --install dagster-release dagster/dagster -f /path/to/values.yaml \
      --set runLauncher.type=CeleryK8sRunLauncher \
      --set dagsterDaemon.queuedRunCoordinator.enabled=true \
      --set rabbitmq.enabled=true
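
If you prefer to keep these settings in version control, the three `--set` flags above map onto the following keys, which could be placed in `values.yaml` instead. This is a direct translation of the flags and assumes no other values under these keys are changed.

```yaml
runLauncher:
  type: CeleryK8sRunLauncher

dagsterDaemon:
  queuedRunCoordinator:
    enabled: true

rabbitmq:
  enabled: true
```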

Helm will launch several pods. You can check the status of the installation with `kubectl`.
If everything worked correctly, you should see output like the following:

    $ kubectl get pods
    NAME                                                     READY   STATUS      RESTARTS   AGE
    dagster-celery-workers-74886cfbfb-m9cbc                  1/1     Running     1          3m42s
    dagster-daemon-68c4b8d68d-vvpls                          1/1     Running     1          3m42s
    dagster-dagit-69974dd75b-5m8gg                           1/1     Running     0          3m42s
    dagster-k8s-example-user-code-1-88764b4f4-25mbd          1/1     Running     0          3m42s
    dagster-postgresql-0                                     1/1     Running     0          3m42s
    dagster-rabbitmq-0                                       1/1     Running     0          3m42s

### Run a Pipeline in Your Deployment

After Helm has successfully installed all the required Kubernetes resources, start port forwarding
to the Dagit pod via:

    export DAGIT_POD_NAME=$(kubectl get pods --namespace default \
      -l "app.kubernetes.io/name=dagster,app.kubernetes.io/instance=dagster,component=dagit" \
      -o jsonpath="{.items[0].metadata.name}")
    kubectl --namespace default port-forward $DAGIT_POD_NAME 8080:80

Visit http://127.0.0.1:8080, navigate to the [playground](http://127.0.0.1:8080/workspace/example_repo@k8s-example-user-code-1/pipelines/example_pipe/playground),
and select the `celery_k8s` preset. Notice that `intermediate_storage.s3.config.s3_bucket` is set to `dagster-test`;
you can replace this string with any other accessible S3 bucket. Then, click _Launch Execution_.
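
For reference, the dotted path `intermediate_storage.s3.config.s3_bucket` corresponds to the run config fragment sketched below. Only the keys named in that path are shown; the rest of the preset's run config is omitted here.

```yaml
intermediate_storage:
  s3:
    config:
      # Replace with any other S3 bucket your deployment can access.
      s3_bucket: dagster-test
```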

You can introspect the jobs that were launched with `kubectl`:

    $ kubectl get jobs
    NAME                                               COMPLETIONS   DURATION   AGE
    dagster-job-9f5c92d1216f636e0d33877560818840       1/1           5s         12s
    dagster-job-a1063317b9aac91f42ca9eacec551b6f       1/1           12s        34s
    dagster-run-fb6822e5-bf43-476f-9e6c-6f9896cf3fb8   1/1           37s        37s

`dagster-job-` entries correspond to Step Jobs and `dagster-run-` entries correspond to Run Workers.

Within Dagit, you can watch the pipeline's progress update live until it succeeds.

## Conclusion

We deployed Dagster, configured with the <PyObject module="dagster_celery_k8s" object="CeleryK8sRunLauncher" />,
onto a Kubernetes cluster using Helm.
